Newton Revisited: An excursion in Euclidean geometry

Greg Markowsky

October 26, 2009

Abstract

This paper discusses the relationship between Kepler's Laws and
Euclidean geometry. Many of the theorems are from Principia by
Isaac Newton, but a more modern manner of presentation is adopted.

1 Introduction 

The goal of this paper is to derive Kepler's Laws of Planetary Motion from the
Law of Universal Gravitation using purely geometric methods. My motives
for this are entirely aesthetic. There's no question that modern calculus
gets the job done, but there is a certain thrill that comes from old-fashioned
Euclidean geometry, with its similar triangles and tangent lines. Except for a 
small number of instances, the reader is encouraged to forget all the calculus 
they know in order to better enjoy the mathematics. Since this paper deals 
with results that are known to be true with complete rigor, the style here will 
be quite informal and nonrigorous. As my advisor used to say, we're going 
to play fast and loose. 

The idea for the paper came about by reading two books. The identity 
of the first, Principia by Isaac Newton, should be obvious. Newton's master- 
piece was an inspiration to me when I discovered it in graduate school, and 
is a feast for geometry lovers. Unfortunately, Principia presents the modern 
reader with a few obstacles that get in the way of the beautiful mathematics. 
To begin with, the style is difficult and unfamiliar, and the translations that 
I have seen contain words which are not in common use these days. Second, 
Newton assumed many theorems about conics which apparently were well
known to mathematicians of his day. Sadly, studying the conics has fallen
out of favor a bit in the time since then, so that even professional mathe- 
maticians may have to do a bit of work on their own to make it through 
Principia, as I did. 

Given these difficulties, it is natural to attempt to "translate" Newton's 
work into modern notation, with background material supplied where neces- 
sary. The most notable recent attempt at this that I know of was by Richard 
Feynman, and presented in a lecture to students at the California Institute 
of Technology (in fact, the core of the argument that Feynman used had already
been discovered by the great James Maxwell, and was published in
[3], though Feynman was probably unaware of it). Feynman's lecture has 
survived in the form of [2]. This is a very enjoyable book, and Feynman's
discussion is ingenious, but it fell short of real satisfaction for me for two 
reasons. First of all, Feynman does not derive Kepler's Third Law in its en- 
tirety from Newton's laws. Perhaps his methods could lead to a derivation, 
but it isn't mentioned in the book. Secondly, and more seriously, Feynman 
went out of his way to avoid dealing too much with the geometrical properties
of the conics. Certainly a person has the right to dislike the conics if
they choose, but I have always subscribed more to the Archimedean school of
thought which contends that the greatest joy in physics lies in the wonderful
geometrical problems that arise. In other words, the motion of the planets
around the sun gives us a great excuse to study the conics.

This paper, then, is a record of my attempt to understand Newton's 
work on planetary orbits. My argument departs from his at some point, but 
up to that point the paper is largely just a retelling of selected pieces of 
Principia. The next section is introductory material on conics. The third
section follows Newton's work, with auxiliary lemmas added where needed. 
The first theorem in the fourth section is also from Principia, while the rest 
of the section gives the way I came up with to deduce Kepler's first and third 
laws. This part of the paper can be considered original, to my knowledge. 
The fifth section gives a solution to a natural problem in mechanics which 
is found to be extremely simple using the methods in the fourth section. If 
the reader can obtain from this paper 10% of the enjoyment that I felt while 
studying Principia, they can consider it time well spent. 






2 Conics



It is expected that the reader has some familiarity with the conics. For the
sake of completeness, however, I have included essentially everything relevant
below. A reader with a good working knowledge of the conics can safely skim
this section. Before we look at the conics, I should mention some possibly
nonstandard bits of terminology that I will use. The measure of an arc on a 
circle is defined to be the magnitude of the angle it subtends at the center 
of the circle. For example, the measure of arc AB (abbreviated as m(AB)) 
below is 35°. 




Furthermore, when an angle inside a circle subtends a pair of arcs in both 
directions, we say that the angle covers the arcs. For example, in the picture 
below, angle $\alpha$ covers arcs AB and CD.

For reasons which I am at a loss to explain, the following theorem is rarely
given in full generality when it is presented in high schools.

Theorem 1 Suppose that two lines intersect at an angle $\alpha$ either inside or
on a circle. Let AB and CD be the arcs covered by $\alpha$. Then

(2.1) m(AB) + m(CD) = $2\alpha$

Proof: Draw the lines parallel to the original two lines but which pass
through the center of the circle O. Let A', B', C', D' be the points of
intersection of these new lines with the circle, as shown below.

We have

(2.2) m(AB) + m(CD) = m(AB') + m(C'D) = m(A'B') + m(C'D')

As m(A'B') + m(C'D') = $2\alpha$ by definition, we are done. □

Note that this proof works just as well if the angle lies on the circle, and 
includes the case where one of the lines is a tangent to the circle. 

The tangent to a curve more general than a circle must be defined. Let 
O and P be two points on a curve which are close to each other, and let P 
be fixed. Draw the line containing both O and P. As we let O approach P, 
if the line containing O and P gets closer and closer to a fixed line T, we 
define T to be the tangent at P. 




We aren't going to worry about whether such a T exists; we'll just assume it
does (as it does) for the curves we care about.

Finally, I will write things like "the slope of the cone" and "the slope 
of the plane". These are not standard terminology, but should cause no 
confusion. Given a cone, orient it so that the vertex is pointing directly up. 
The vertical line through the vertex is the axis of the cone, and we define the 
slope of the cone to be the ratio of the height of the vertex above O to the
length OA in the picture below. Note that O is the center of the
base, and A is a point on the outside of the base.

To define the slope of a plane P, begin by drawing a vertical line L passing
through the plane at a point P. Choose a point O on L but not on the plane,
and let A be the closest point on the plane so that OA is perpendicular to
L. Then the slope of the plane is defined to be OP/OA.




The following lemma will help in dealing with tangents. 



Lemma 1 Suppose that a circle is tangent to a pair of lines with points of
tangency a and b, and that the pair of lines meet at n. Let o be the center of
the circle. Then the vector $\vec{ao} + \vec{bo}$ is parallel to no.

Proof: The proof is immediate, since the entire picture is symmetric
around the angle bisector no. □

Suppose we have two circles of different radii with the smaller contained 
in the larger. 



We are allowing in this setup that the smaller is internally tangent at a 
point to the larger. Let E be the set of all points which are centers of circles 
tangent to both of our original circles. Then E is a curve, and this curve is 
an ellipse.

The two centers of the original circles are known as the foci of the ellipse.
The following proposition gives the property of ellipses which is usually used 
to characterize them. 



Proposition 1 The sum of the distances from the foci of an ellipse to any 
point on the ellipse is equal to the sum of the radii of the two original circles. 

Proof: Examine the picture below. 




$F_1$ and $F_2$ are the centers of the two fixed circles of radius $r_1$ and $r_2$, and
O is a point on the ellipse whose corresponding circle has radius $r_3$. We see
that $F_2O = r_2 - r_3$, and $F_1O = r_1 + r_3$. Thus, $F_2O + F_1O = r_1 + r_2$, a
constant. □



And now we have the all-important reflection property. 



Proposition 2 A beam of light fired from one focus which reflects off the
ellipse will strike the other focus. In other words, $\angle F_1OC = \angle F_2OD$.




Proof: Draw the circle with center O which is tangent to the two original
circles. If point O moves infinitesimally along the ellipse, the circle with O at
the center will expand or contract. Since we are only moving infinitesimally,
we can replace the two original circles with their tangents at points A and B
and apply Lemma 1. These tangents are also tangent to the circle with O at
the center, and are therefore perpendicular to $\vec{AO}$ and $\vec{BO}$. Applying Lemma
1, we conclude that a tangent vector to the ellipse at O is given by $\vec{AO} + \vec{BO}$,
and thus the tangent bisects $\angle BOA$ so that $\angle BOC = \angle AOC = \angle F_2OD$. □

Remarks: i) If the two foci coincide, then the curve is a circle, and we
see that circles are just special cases of ellipses.

ii) The property of ellipses given in Proposition 1 is often given as the
defining property of ellipses. It is left to the reader to show that any curve
that satisfies Proposition 1 can be created by the construction that we have
used to define ellipses.

iii) All ellipses besides circles can be created by our construction where the 
smaller circle is internally tangent to the larger, and it is generally simpler 
to assume that the two circles are tangent. We didn't want to assume
that here, though, as we want to include circles. For the remaining conics
we lose no generality in assuming tangency, and we will do so, though it is 
not necessary. 

Suppose now we have two circles of different radii which are externally 
tangent. Let H be the set of all points which are centers of circles which are 
externally tangent to both circles. H is a curve, and this curve is known as
a hyperbola. The centers of the original circles, $F_1$ and $F_2$, are known as the
foci of the hyperbola.

Proposition 3 The difference between the distances from the foci to any
point on the hyperbola is a constant.

Proof: Let $r_1$, $r_2$, and $r_O$ be the radii of the circles centered at $F_1$, $F_2$,
and O in the above picture. Then

(2.3) $F_2O - F_1O = (r_2 + r_O) - (r_1 + r_O) = r_2 - r_1$

This is a constant, so we are done. □ 

As with the ellipse, we have a very pretty reflection property. 

Proposition 4 Suppose we fire a beam of light from infinity (in other words, 
from outside the picture) at one of the foci. If the beam strikes the hyperbola
before reaching the focus, it will reflect off the hyperbola and strike the other
focus. That is, $\angle F_2OY = \angle \infty OX$.

Proof: Pick a point O on the hyperbola, and draw all relevant circles.
We must show $\angle F_1OY = \angle F_2OY$. But this follows directly from Lemma 1,
as in Proposition 2. □




Remarks: i) If the two original circles have the same radii, the hyperbola 
reduces to a straight line. 

ii) In most books on conic sections, the hyperbola consists of two parts, 
one such as we have described, the other the mirror image reflected around 
the midpoint of $F_1F_2$. This mirror image will be obtained by the same
construction, interchanging the radii of the circles centered at $F_1$ and $F_2$.

The last conic section is the parabola. Suppose that we have a circle of
radius R tangent to a straight line. Let P be the set of all points which are 
the centers of circles tangent both to the line and externally to the circle. P 
is a curve, known as the parabola. The center of the original circle is called 
the focus. 




Displacing the original line R units down forms a new line L, known as 
the directrix. 

Proposition 5 The distance from any point on the parabola to the focus is 
equal to the distance from that point to the directrix.

Proof: Let O be a point on the parabola, and let the corresponding circle
have radius $R_O$. Then the distance from the focus to O is $R + R_O$, as is the
distance from the directrix to O. □



Finally we have the reflection property, the one that Archimedes is re- 
puted to have used to torch attacking Roman ships. 

Proposition 6 A beam of light coming straight down will strike the parabola, 
reflect off, and hit the focus. In other words, in the picture below $\angle qOy = \angle aOx$.

Proof: Using Lemma 1 as in Proposition 2, $\angle xOa = \angle xOb = \angle qOy$. □




A plane which cuts through a cone without touching the vertex gives 
one of the conic sections. The slope of the plane determines which conic we 
obtain. A plane whose slope is less than the slope of the cone produces an 
ellipse.

A plane with a greater slope than the slope of the cone gives a hyperbola.




A plane whose slope is the same as the slope of the cone gives a parabola.

Let us examine why this is so, beginning with the ellipse. Suppose that
we have a sphere S and a point O away from the sphere in three dimensions. 
Then the set of all rays beginning at the point which are tangent to the 
sphere will form a cone, and we will say that the sphere is inscribed in the 
cone. The set of points on the surface of S which are touched by the cone 
will form the circle of tangency. The points on this circle are equally distant 
from O, since rotating the entire picture around a vertical axis through O by 
any angle does not change the picture.

Suppose we slice a cone with a plane P as shown below. Let a small sphere
be inscribed in the cone above the ellipse, and a large sphere inscribed below
the ellipse. Expand the small sphere while keeping it inscribed in the cone
until it is tangent to P at point a. Contract the large sphere in the same
way until it is tangent to P at b. The result is that a and b are the foci of
the ellipse. To see this, let q be a point on the ellipse. Let $C_1$ and $C_2$ be the
circles of tangency of the two spheres, and let $p_1$ and $p_2$ be the points on $C_1$
and $C_2$ closest to q, i.e. such that $p_2 q p_1 O$ is a straight line. Since qb and $qp_2$
are both tangent to the larger sphere, their lengths are equal. The same is
true for qa and $qp_1$. Thus, $qb + qa = p_1p_2$, which is a constant independent
of the choice of q. We see that the curve formed is indeed an ellipse, with a
and b as its foci.






Now let's do something similar with the hyperbola. We need to extend 
the rays through O to form another cone above. Intersect a plane P with 
the cone to form a curve H, as shown below. Start with two small inscribed 
spheres above and below O, and expand them until they are tangent to P 
at points a and b. Let $C_1$ and $C_2$ be the circles of tangency of the spheres.
Choose a point q on H, and let $p_1$ and $p_2$ be the points on $C_1$ and $C_2$ so
that $q p_2 O p_1$ is a straight line. Since $qb = qp_2$ and $qa = qp_1$, we see that
$qa - qb = p_1p_2$, which is a constant. Thus, H is a hyperbola, with a and b
the foci.

Now for the parabola. We let P be a plane with the same slope as the
cone, as shown below. Expand a small inscribed sphere near O until it 
becomes tangent to P at a. Let C be the circle of tangency of this sphere, 
and draw a line L on P at the same height as C. If q is a point on the curve,
qa = qp, where p is the point on C such that qpO is a straight line. Drop a
perpendicular to line L from q to point l. Since P is at the same angle to the
vertical as the side of the cone and C is at the same height as L, qp = ql, so
that qa = ql. We see that the curve is a parabola with focus a and directrix
L.

Before we move on, let us notice one more thing about the parabola
example. Let us keep everything as it was, but change the angle of the plane
P. We now have an ellipse or a hyperbola, and it is no longer true that
qa = ql. However, it is true that $\frac{qa}{ql} = \frac{qp}{ql}$ is a constant, since it can be
expressed as the ratio of the slope of the plane and the slope of the cone. We
obtain a new description of the conics.

Theorem 2 Let L be a line and a a point in the plane. The conics can be
realized as the set of all points p such that $\frac{pa}{pl} = e$, where l is the point closest
to p on L, and e > 0 is a constant. If e < 1, the conic is an ellipse. If e = 1,
the conic is a parabola. If e > 1, the conic is a hyperbola.

We refer to L as the directrix, as has already been mentioned with the
parabola, and to the constant e as the eccentricity. Let's rip through a few
more propositions on conics.



Proposition 7 Let C be a conic with directrix L, focus a, and eccentricity
e. Choose b and c on C such that bac is parallel to L. Let p be any point on
the conic, and let r be the length of ap. Let $\theta$ be the angle of ap above ac.
Then

(2.4) $r = \frac{e(aO)}{1 - e\sin\theta}$

where O is the point on L closest to a.

Proof: Drop perpendiculars from p and a to L, meeting L at l and O.
Then $pl = aO + r\sin\theta$. We have

(2.5) $e = \frac{pa}{pl} = \frac{r}{aO + r\sin\theta}$

This last equation can be converted to (2.4) by algebra. □
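As a quick sanity check on (2.4), here is a minimal Python sketch. It assumes a coordinate system of my own choosing (not from the text): the focus a sits at the origin and the directrix L is the horizontal line y = -d, so that aO = d. The function names are hypothetical. For points generated from (2.4), the ratio pa/pl of Theorem 2 should come back as e.

```python
import math

def conic_point(e, d, theta):
    """Point on the conic with focus at the origin and directrix y = -d,
    generated from (2.4): r = e*d / (1 - e*sin(theta))."""
    r = e * d / (1.0 - e * math.sin(theta))
    return r, (r * math.cos(theta), r * math.sin(theta))

def focus_directrix_ratio(e, d, theta):
    """Return pa/pl for the point at angle theta; Theorem 2 says this is e."""
    r, (x, y) = conic_point(e, d, theta)
    pa = math.hypot(x, y)   # distance from p to the focus a at the origin
    pl = y + d              # distance from p to the directrix y = -d
    return pa / pl

# One ellipse, one parabola, one hyperbola (angles chosen so that r > 0).
for e in (0.5, 1.0, 2.0):
    print(e, [focus_directrix_ratio(e, 1.0, th) for th in (-0.4, 0.0, 0.3)])
```

The three eccentricities cover the three cases of Theorem 2; for the hyperbola the angles are kept small so that the denominator in (2.4) stays positive.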



The following is a beautiful little fact which may not be well known, 
although it appears in several older books on conics.



Proposition 8 Let C be a conic with directrix L and focus a. Let p be
a point on C, and let T be the tangent to C at p. Let l be the point of
intersection of T and L. Then $\angle pal$ is a right angle.

Proof: Let q be a point close to p, as shown below. Extend pa and qa
to points p' and q' on C, and drop perpendiculars from p and q to $l_p$ and $l_q$
on L. Extend pq to meet L at l, and drop perpendiculars from l to $\tilde{p}$ and $\tilde{q}$
on pp' and qq'.

The first order of business is to show that $\angle q'al = \angle pal$. We have the
following string of equalities.

(2.6) $\frac{\mathrm{Area}(\triangle lap)}{\mathrm{Area}(\triangle laq)} = \frac{lp}{lq} = \frac{pl_p}{ql_q} = \frac{ap}{aq}$

The first equality is because $\triangle lap$ and $\triangle laq$ share a common height to vertex
a, so the ratio of their areas is equal to the ratio of their bases. The second is
because $\triangle lpl_p \sim \triangle lql_q$. The third is due to Theorem 2. However,
$\mathrm{Area}(\triangle lap) = (1/2)(ap)(l\tilde{p})$, and $\mathrm{Area}(\triangle laq) = (1/2)(aq)(l\tilde{q})$.
We conclude that $l\tilde{p} = l\tilde{q}$, and it follows that
$\triangle la\tilde{p} \cong \triangle la\tilde{q}$. Thus, $\angle q'al = \angle pal$. Now, hold p fixed and let q approach p.
Then qpl becomes the tangent at p, and $\angle paq'$ becomes a 180° angle. As this
happens, $\angle pal$ becomes a right angle, and we are done. □

The next theorem doesn't have a particularly exciting statement, but it 
will be crucial when we begin calculating the orbits of planets later on. 

Theorem 3 Let C be a conic, with focus a, directrix L and point O chosen
on L so that aO and L are perpendicular. Let p be any point on C, let r be
the length of ap, and let $\alpha$ be the angle made by ap and the tangent to C at
p. Then

(2.7) $\csc^2\alpha = \frac{r^2(e^2 - 1)}{e^2(aO)^2} + \frac{2r}{e(aO)}$

where e is the eccentricity of C.

Remark: Note that there are two possibilities to choose from for angle
$\alpha$. However, if we label them $\alpha_1, \alpha_2$, we see $\alpha_1 + \alpha_2 = 180^\circ$, so that
$\csc\alpha_1 = \csc\alpha_2$. In other words, it doesn't matter which one we choose.

Proof: Choose $\alpha$ as shown below, and let l be the intersection of the
tangent at p with L. Then $\angle pal$ is a right angle, by Proposition 8. Drop a
perpendicular from a to point O on L. It is clear from the picture that
$aO = r\cos\theta\tan\alpha$, where $\theta$ is the angle from Proposition 7. Squaring this
equation gives

(2.8) $(aO)^2 = r^2\cos^2\theta\tan^2\alpha$

Recall that

(2.9) $r = \frac{e(aO)}{1 - e\sin\theta}$

by Proposition 7. Rearranging this and squaring gives

(2.10) $\sin^2\theta = \frac{(r - e(aO))^2}{r^2e^2}$

Thus,

(2.11) $\cos^2\theta = 1 - \sin^2\theta = \frac{r^2e^2 - (r - e(aO))^2}{r^2e^2}$

Plugging this into (2.8) and rearranging gives

(2.12) $\tan^2\alpha = \frac{(aO)^2e^2}{r^2e^2 - (r - e(aO))^2}$

Thus

(2.13) $\cot^2\alpha = \frac{r^2e^2 - (r - e(aO))^2}{(aO)^2e^2} = \frac{r^2(e^2 - 1)}{e^2(aO)^2} + \frac{2r}{e(aO)} - 1$

Adding 1 to both sides and using the identity $\csc^2\alpha = 1 + \cot^2\alpha$ completes
the proof. □

One last proposition about ellipses. Chords of conics are line segments
connecting two points on the conic. The major axis of an ellipse is the chord
passing through the two foci, and the minor axis is the chord contained in
the perpendicular bisector of the major axis.

Proposition 9 Let E be an ellipse with eccentricity e, focus a, directrix L
and point O chosen on L so that aO and L are perpendicular. Let Q be the
center of the ellipse, and let X = QB, Y = QC be the semi-major and semi-minor
axes of the ellipse, respectively. Let G be the area of the ellipse. Then

(2.14) $e = \frac{\sqrt{X^2 - Y^2}}{X}$

(2.15) $(aO) = \frac{Y^2}{\sqrt{X^2 - Y^2}}$

(2.16) $X = \frac{(aO)e}{1 - e^2}$

(2.17) $Y = \frac{(aO)e}{\sqrt{1 - e^2}}$

Proof: These relations can be worked out by straightforward but uninspired
calculations in the plane. Happily, there is an inspired way to do it if
we jump to three dimensions. We will consider an ellipse as the intersection
of a cylinder of radius Y with a plane P. The same argument as with the
cone shows that the resulting curve is an ellipse; alternatively, we can consider
a cylinder as the limiting case of a cone as we let the height go to infinity
while keeping the base fixed.






Inscribe a sphere in the cylinder above P, then slide it down until it is 
tangent to P at one point. By the same argument as was used to prove 
Theorem 2, this point is the focus of the ellipse, a. Furthermore, L is the line
on P which is the same height as the center of the sphere, since the circle 
of tangency of the sphere is the circle of that same height. Let us view the 
entire setup as a cross section from the side. 



We have

(2.18) $e = \frac{aB}{OB} = \frac{RB}{OB} = \frac{TB}{BB'} = \frac{\sqrt{X^2 - Y^2}}{X}$

The first equality is the definition of e, the second is by the equality of
tangents from a point (B) to a circle, the third is because triangles BRO
and BTB' are similar, and the fourth is because BB' = 2X, B'T = 2Y,
and therefore $BT = 2\sqrt{X^2 - Y^2}$. Furthermore, triangles OaS and B'TB are
similar, so that

(2.19) $\frac{aO}{Sa} = \frac{B'T}{TB} = \frac{Y}{\sqrt{X^2 - Y^2}}$

As Sa = Y, we see that $aO = \frac{Y^2}{\sqrt{X^2 - Y^2}}$. We have established the first two
relations. The final two are easy from this point.

(2.20) $\frac{(aO)e}{1 - e^2} = \frac{Y^2/X}{1 - (X^2 - Y^2)/X^2} = X$

(2.21) $\frac{(aO)e}{\sqrt{1 - e^2}} = \frac{Y^2/X}{\sqrt{1 - (X^2 - Y^2)/X^2}} = Y$

□
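The relations (2.14) through (2.17) are easy to check numerically. The following is a small Python sketch, with function names of my own invention, that takes X and Y to be the semi-axes (as BB' = 2X and B'T = 2Y suggest), computes e and aO from (2.14) and (2.15), and then recovers X and Y from (2.16) and (2.17).

```python
import math

def ellipse_data(X, Y):
    """Eccentricity e and focus-to-directrix distance aO for semi-axes
    X > Y > 0, using (2.14) and (2.15)."""
    c = math.sqrt(X * X - Y * Y)
    e = c / X
    aO = Y * Y / c
    return e, aO

def axes_from(e, aO):
    """Recover the semi-axes X and Y from e and aO via (2.16) and (2.17)."""
    X = aO * e / (1.0 - e * e)
    Y = aO * e / math.sqrt(1.0 - e * e)
    return X, Y

e, aO = ellipse_data(5.0, 3.0)
print(e, aO)             # 0.8 2.25
print(axes_from(e, aO))  # approximately (5.0, 3.0), up to rounding
```

For the 5-by-3 ellipse the focus lies 4 units from the center and the directrix 5/0.8 = 6.25 units from the center, so aO = 2.25 agrees with the usual textbook picture.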



3 Kepler, Newton, and experimental data 

Having spent some time in the land of Euclidean geometry, let's return for 
a bit to the real world. In the early 1600s, Johannes Kepler observed the
following rules, which are now famous. 

1. The planets (including the earth) revolve in ellipses about the sun, with 
the sun at one of the foci of the ellipse. 

2. A line from the sun to a planet sweeps out equal areas in equal times.

3. The time it takes for a planet to revolve once around the sun is proportional
to the length of the major axis of the ellipse of revolution raised to
the 3/2 power.

Enter Newton. Newton made a few assumptions, validated through ex- 
periments, and proceeded to derive Kepler's laws, or at least come close (there 
is some debate on whether Newton had something akin to uniqueness for his 
orbits; see [6]). We won't need to follow Newton's assumptions to the letter, 
so we'll make our own. First, let us assume that an object at rest will tend 
to stay at rest, and any moving object will tend to continue in a straight line 
with a constant velocity. A change in velocity only happens when the object 
is acted upon by another object in some way. Such a change could occur due 
to a collision, but there are other ways as well. This property of matter is 
generally known as inertia, and was first proposed by Galileo. To describe 
our next assumption, suppose that an object O is moving, and that in the 
absence of forces acting on it O would move like so in some amount of time: 








Suppose in addition that a force acts on O, and were this force to act
upon O at rest, then in the same amount of time O would move like so:

Then the actual motion of O will be as OC below:

Furthermore, the same is true if a larger number of forces acts on an 
object. In that case we add the vectors corresponding to each force, and the 
result gives the movement of the object. Now, let us see what happens if we 
suppose that an object moves according to these assumptions subject only to 
some force which is always directed at another object which does not move. 






Theorem 4 Let an object O move through space subject only to forces di- 
rected at a stationary object S. Then line segments connecting O and S will 
sweep out equal areas in equal times. 




Proof: Let O begin at point A, and over some small period of time t
move to point B. Then, by the law of inertia, if O were not acted upon by
any force it would continue on in a straight line, arriving at c at time 2t.
However, at point B it is affected by a "single but great impulse" (Newton's
words) directed at S. Let this impulse be equal in magnitude to BV. To find
the actual movement of O we complete the parallelogram BVCc, and we see
that O resides at point C at time 2t. Again, in the absence of force O would
move to point d at time 3t, but we will assume that O is affected by the impulse
CW at point C, so that in fact O resides at D at time 3t. Now, we must
show that Area($\triangle SAB$) = Area($\triangle SBC$) = Area($\triangle SCD$). Note first that
Area($\triangle SAB$) = Area($\triangle SBc$), since the bases AB and Bc of the two triangles
are equal, and their common altitude is the perpendicular dropped from
S to the line containing AB. Furthermore, Area($\triangle SBc$) = Area($\triangle SBC$),
since these two triangles share the base SB and have equal altitudes due to
the fact that Cc is parallel to SB. Thus, Area($\triangle SAB$) = Area($\triangle SBC$).
The same argument shows that Area($\triangle SBC$) = Area($\triangle SCD$), and therefore
$\triangle SAB$, $\triangle SBC$, and $\triangle SCD$ all have the same area.

Now, in general, a force will act continuously and not with isolated impulses.
However, if we let the number of triangles in the above argument
increase to a very large number we will obtain an excellent approximation to
the true path, and in all cases equal areas will be swept out in equal times.
Letting the number of triangles go to infinity, we obtain a smooth curve in
which equal areas are swept out in equal times, and this gives us the general
case. □
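The polygon in this proof is easy to imitate on a computer. Below is a minimal Python sketch, under assumptions of my own: S sits at the origin, the object coasts for a time dt, and then receives an impulse aimed at S whose size is given by a hypothetical function `strength(r)`. The particular choice of `strength` should not matter, since the argument only uses the direction of the impulse.

```python
def triangle_area(s, p, q):
    """Area of the triangle with vertices s, p, q (cross-product formula)."""
    return abs((p[0] - s[0]) * (q[1] - s[1]) - (p[1] - s[1]) * (q[0] - s[0])) / 2.0

def impulse_orbit(pos, vel, dt, steps, strength):
    """Newton's polygon: coast for time dt, then apply an impulse aimed at
    S = (0, 0). strength(r) is the size of the velocity change at distance r
    (an illustrative assumption, any central impulse will do)."""
    s = (0.0, 0.0)
    areas = []
    for _ in range(steps):
        nxt = (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)  # inertial step
        areas.append(triangle_area(s, pos, nxt))            # area swept in dt
        r = (nxt[0] ** 2 + nxt[1] ** 2) ** 0.5
        k = strength(r) / r
        vel = (vel[0] - k * nxt[0], vel[1] - k * nxt[1])    # impulse toward S
        pos = nxt
    return areas

areas = impulse_orbit((1.0, 0.0), (0.0, 1.0), 0.01, 500, lambda r: 0.01 / r ** 2)
print(max(areas) - min(areas))  # differs from zero only by rounding error
```

Swapping the inverse square `strength` for any other positive function of r leaves the areas equal, which is the content of Theorem 4.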

To steal a line from Richard Feynman (unrelated to this discussion), if a 
person cannot see the connection between this theorem and Kepler's second 
law, then they have no soul. We have no choice but to at least guess that the 
planets move in their orbits due to an acceleration which is always directed 
at the sun. The following is a corollary that will be somewhat easier to apply. 

Corollary 1. Let an object O move through space subject only to forces 
directed at a stationary object S. Then there is a constant k such that, for
any two points P and Q on the orbit, we have

(3.1) T = kA

where T is the time it takes the object to move from point P to point Q, and
A is the area of sector SPQ.

Proof: Let A' be the total area of the orbit, let T' be the time it takes
the object to complete one orbit, and let k = T'/A'. If we divide the orbit into
N equal pieces of time then they must each contain equal areas by Theorem 4,
and this area must be A'/N. Thus, we see that (3.1) is satisfied for any time
interval which is a rational multiple of the period of the orbit. An arbitrary
interval can be approximated as closely as we please by a rational interval,
so the result follows. □



Let us consider, now, how an object at rest moves under the effect of a 
constant continuous force. 

Theorem 5 Suppose an object P at rest at time 0 experiences a constant
acceleration a. Then the displacement from the initial position at time T is
equal to $(1/2)aT^2$.

Proof: The velocity at time t is given by at. Divide the interval [0, T]
into N equal intervals, [0, T/N], [T/N, 2T/N], ..., [(N - 1)T/N, T]. Let
us assume as an approximation that the velocity over the interval
[(n - 1)T/N, nT/N] is anT/N. Thus, the distance that P covers over the time
interval [(n - 1)T/N, nT/N] is (anT/N)(T/N). Adding the distance for each
of these intervals, we get

(3.2) $\frac{aT^2}{N^2}(1 + 2 + \ldots + N) = aT^2\,\frac{N(N+1)}{2N^2}$

As $N \to \infty$, we obtain a better and better approximation, and since
$\frac{N(N+1)}{2N^2} = \frac{1}{2} + \frac{1}{2N}$ we have $\frac{1}{2N} \to 0$.
We conclude that the distance covered at time T must be $\frac{aT^2}{2}$. □
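To see the sum in (3.2) settle down, here is a short Python sketch of the same approximation. The function name and the sample values of a and T are my own choices for illustration.

```python
def distance_by_steps(a, T, N):
    """Approximate distance covered in time T from rest, taking the velocity
    on the n-th of N equal intervals to be a*n*T/N, as in the proof."""
    dt = T / N
    return sum(a * n * dt * dt for n in range(1, N + 1))

a, T = 9.8, 2.0
exact = 0.5 * a * T * T
for N in (10, 100, 1000, 10000):
    approx = distance_by_steps(a, T, N)
    # The excess over a*T^2/2 should shrink like a*T^2/(2N).
    print(N, approx, approx - exact)
```

Each tenfold increase in N should cut the excess by roughly a factor of ten, in line with the 1/(2N) term.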



Now that we have this theorem, let us turn our attention to the force 
directed at the sun which we theorize keeps the planets in their orbits. To 
begin with, it is a logical guess that the strength of the acceleration a at any 
point P in space depends only on the distance from P to the sun, and not 
on the direction of P from the sun. We'll assume that from now on. We 
will use the notation A ~ B for two varying quantities A and B to denote
the property that A/B = M for some constant M. Since elliptical orbits are
too complicated to deal with without warming up a bit, let's simplify by 
considering only circular orbits. 

Theorem 6 Suppose that an object P can be kept in uniform circular motion
at any radius by an acceleration a directed at an immovable object S which
lies at the center of the orbit. Let $I_R$ be the amount of time P takes to
complete an orbit of radius R around S. Suppose that $I_R \sim R^{3/2}$. Then
$a \sim 1/R^2$.

Proof: An orbit of radius R has circumference $2\pi R$, so if v is the velocity
of the object in this orbit then $I_R = 2\pi R/v$. Thus, $R/v \sim R^{3/2}$, i.e. $v \sim 1/\sqrt{R}$.
Let P be at the top of the orbit at a certain time, and at point b a small
amount of time t later. Drop a perpendicular from b to c on the tangent to
the orbit at P. Let d be the point opposite P in the orbit.

The movement of P to b can be broken into two components. The first is
Pc, which is due to the momentum of the planet at point P and is therefore
approximately equal to vt. We know that $v = k/\sqrt{R}$ where k is a constant, so
Pc is approximately $kt/\sqrt{R}$. The other component of the motion is cb, which is
due to the acceleration a directed at S. We suppose that this acts as a single
impulse at point P directed at S, which is why cb is parallel to PS. This
impulse has magnitude $\frac{1}{2}at^2$ by the previous theorem. Since $\angle Pdb$ and $\angle bPc$
subtend equal arcs, they are equal. Furthermore, $\angle Pcb$ and $\angle Pbd$ are both
right angles. Thus, $\triangle Pdb$ and $\triangle bPc$ are similar, and

(3.3) $\frac{Pb}{2R} = \frac{\frac{1}{2}at^2}{Pb} \implies a = \frac{(Pb)^2}{Rt^2}$

Now, we let b approach P so that all of our approximations become accurate.
In doing this $\angle bPc \to 0$, so that $Pb/Pc \to 1$. Thus, we can replace Pb by
$Pc = kt/\sqrt{R}$ in (3.3) to obtain $a = k^2/R^2$. Thus, $a \sim 1/R^2$. □

Eureka! Experimental data in the form of Kepler's third law has led us 
to guess that the acceleration keeping the planets in their orbits at any point 
is proportional to $1/R^2$, where R is the distance to the sun. We will say that
such an acceleration (or, equivalently, a force) satisfies an inverse square law.
Now that we have this clue, let's play around with a bit more data to see
what happens. There is an orbit which is closer to us than the orbits of the
planets, namely that of the moon. Let's again approximate this orbit by a
circle. This circle would have a radius of about 385,000 km = 385,000,000
m. One complete orbit takes about 27.3 days = 2,358,720 seconds. By the
same argument as in the previous theorem, if a is the acceleration keeping
the moon in its orbit, then

(3.4) $a = \frac{v^2}{R}$

v is the circumference over the time, that is $\frac{2\pi(385{,}000{,}000)}{2{,}358{,}720} \approx 1{,}025$ m/s. Thus,

(3.5) $a \approx \frac{(1{,}025)^2}{385{,}000{,}000} \approx .0027$ m/s$^2$

So what does this prove? Let's just notice that the force of gravity at the
surface of the earth creates an acceleration of about 9.8 m/s$^2$, and that the
average radius of the earth is in the ballpark of 6,367,000 m. Thus, the ratio
between the acceleration at the surface of the earth and 1 / (Distance from
surface of earth to center of earth)$^2$ is about $3.97 \times 10^{14}$. Furthermore, the
ratio between the acceleration on the moon and 1 / (Distance from the moon
to the center of the earth)$^2$ is approximately $4.00 \times 10^{14}$. Eureka again! These
numbers are almost identical, so if gravity is the acceleration a keeping the
moon in orbit, then again we would have $a \sim 1/R^2$, where R is the distance
from an object to the center of the earth. And if gravity keeps the moon in
orbit around the earth, then why not the planets around the sun? It all fits
together.
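
The arithmetic behind this comparison is easy to redo. The following is only a sketch using the rounded figures quoted in the text; the function and constant names are illustrative and not part of the paper. It computes the moon's speed and acceleration and the two products (acceleration times distance squared) that the inverse square law predicts should agree.

```python
import math

# Rounded values quoted in the text (assumed, not precise astronomical data).
R_MOON = 385_000_000.0        # radius of the moon's orbit, in metres
T_MOON = 27.3 * 24 * 3600     # one orbit, in seconds (= 2,358,720)
G_SURFACE = 9.8               # acceleration at the earth's surface, m/s^2
R_EARTH = 6_367_000.0         # average radius of the earth, in metres

def moon_check():
    """Return (v, a, k_moon, k_earth) for the circular-orbit approximation."""
    v = 2 * math.pi * R_MOON / T_MOON   # circumference over time
    a = v ** 2 / R_MOON                 # equation (3.4)
    k_moon = a * R_MOON ** 2            # acceleration / (1 / distance^2)
    k_earth = G_SURFACE * R_EARTH ** 2
    return v, a, k_moon, k_earth

if __name__ == "__main__":
    v, a, k_moon, k_earth = moon_check()
    print(f"v = {v:.1f} m/s, a = {a:.5f} m/s^2")
    print(f"k_moon = {k_moon:.3e}, k_earth = {k_earth:.3e}")
```

With unrounded intermediate values the two products differ by a couple of percent, which is as close as one can expect from a circular approximation and ballpark inputs.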

There is one troubling objection to this last argument, however, which 
the reader may have noticed. That is, I assumed that the force of gravity 
somehow originates at the center of the earth, which is a hard assumption 
to justify. In fact, in Theorem 6 this same assumption was made as well, as
we treated the objects as points without taking into account their size. In
the case of Theorem 6 we can perhaps argue that the radii of the orbits are
so large compared to the radii of the objects that we may well treat them as
points, but in the more recent argument this is less convincing. How do we
get around this? 

Happily, this objection occurred to Newton as well, so we need only con- 
sult Principia. We begin by assuming that all pieces of matter are accelerated 
towards all other pieces of matter in an inverse square law. To be precise,
if the masses of two small chunks of matter $O_1$ and $O_2$ are $m_1$ and $m_2$, and
the distance between them is r, then $O_1$ undergoes an acceleration of $\frac{g m_2}{r^2}$
towards $O_2$, and $O_2$ undergoes an acceleration of $\frac{g m_1}{r^2}$ towards $O_1$, where g
is a (very small) constant. The ensuing theorem shows that with spherical
objects we can assume that all the mass is concentrated at the center of the 
object. Thus, the assumption we made above causes no difficulty. Before the 
theorem, let's prove a lemma that we'll need. 

Lemma 2 Let V be a very thin ring on the surface of a sphere centered at
a point a. Then the surface area of the ring is approximately $2\pi xw$, where x
is the distance from the ring to the radius of the sphere through a, and w is
the width of the ring.

Proof: The two pictures below represent the same scenario, with the 
second one being a cross section directly from the side. 




We want to prove that the area of V is about $2\pi(mi)(ih)$. Since ih is
very small, we can approximate it with a straight line. In doing so, the ring
is approximated by a piece of the top surface of a cone. Here is the cross
section of that cone.

The area of the top surface of a cone is given by $\pi rs$, where r is the radius
of the base, and s is the distance from the vertex of the cone to the outside
of the base. Thus, in the picture above, the area of the ring around the cone
determined by ih is

(3.6) $\pi(mi + z)(oi + y) - \pi(mi)(oi) = \pi((mi)y + (oi)z + yz)$

Now, y and z are both extremely small, so that yz is very small indeed, much
smaller than (mi)y and (oi)z. As such, we'll ignore that term. Furthermore,
by similar triangles $\frac{z}{y} = \frac{mi}{oi}$, so $(oi)z = (mi)y$. We see that the area of the
ring is approximately $2\pi(mi)y = 2\pi(mi)(ih)$, which is what we set out to prove. □

Theorem 7 Let O be an unmoving spherical object of uniform mass and
center S, and let P be a particle outside O. Suppose that P undergoes an
acceleration a towards every particle V in O, where $a = \frac{g m_V}{r_V^2}$ with $m_V$ the
mass of V and $r_V$ the distance between P and V. Then the overall attraction
exerted upon P by O is inversely proportional to $PS^2$.

Proof: Let P and p be two identical particles placed at different
distances from O. Let us suppose first that O is not actually a solid sphere, but
is instead a very thin spherical shell. We draw two similar diagrams,
corresponding to P and p, where the point labels in the first are all capitalized
versions of the lower case labels in the second. These diagrams represent
cross sections of O.

We will show that the ratio between the accelerations exerted on P and
p by O is $\frac{(ps)^2}{(PS)^2}$. Begin by drawing a line PIL, cutting O at I
and L, and draw another such chord PHK such that $\angle IPH$ is very small.
On the other diagram, draw lines pil and phk such that arc il = arc IL,
and arc hk = arc HK. Let SE and se be perpendicular to IL and il, SD
and sd perpendicular to HK and hk, and IR and ir perpendicular to PK
and pk. We will consider the effect on P due to gravity of the ring created
by rotating HI around the axis AS. Given a little piece of this ring, the
acceleration on P would be a constant k times the area of the piece (which is
proportional to the mass of the piece) divided by the distance squared. This
distance between P and the ring will be approximately the length of PI for
any point on the ring, so we'll just use that for the distance. The total area
of the ring is about $2\pi(MI)(IH)$ by Lemma 2, but the force exerted by the
ring on P is not $\frac{2\pi k(MI)(IH)}{(PI)^2}$. Resolve the acceleration along PI into the sum of
an acceleration along PM and an acceleration along MI. The acceleration
along MI will be canceled by a corresponding acceleration from the bottom
of the ring. Thus, all that we need consider is the acceleration along PM.
The ratio of this acceleration to the acceleration along PI is $\frac{PM}{PI} = \frac{PE}{PS}$. Now,
since $\angle LPK$ is very small, PE and PF are nearly equal, so we can replace
this ratio with $\frac{PF}{PS}$, and it follows that the total acceleration upon P from
the ring formed by IH is proportional to $\left(\frac{(IH)(IQ)}{(PI)^2}\right)\frac{PF}{PS}$. Of course, the same
argument shows that the acceleration upon p by the ring formed by ih is
proportional to $\left(\frac{(ih)(iq)}{(pi)^2}\right)\frac{pf}{ps}$. The first thing we need to show is that

(3.7) $\frac{\left(\frac{(IH)(IQ)}{(PI)^2}\right)\frac{PF}{PS}}{\left(\frac{(ih)(iq)}{(pi)^2}\right)\frac{pf}{ps}} = \frac{(ps)^2}{(PS)^2}$

Note that $\frac{PI}{PF} = \frac{RI}{DF}$ by similar triangles, and similarly $\frac{pf}{pi} = \frac{df}{ri}$. Thus,

(3.8) $\frac{(PI)(pf)}{(PF)(pi)} = \frac{(RI)(df)}{(DF)(ri)}$

We can argue that DF and df are nearly equal, though, as follows. Arcs
HK and hk are equal, as are IL and il. Furthermore, since $\angle LPK$ and $\angle lpk$
are very small, HK and hk are nearly centered in IL and il. Thus,

(3.9) $DF \approx DS - ES = ds - es \approx df$

So we can replace $\frac{df}{DF}$ with 1 to get

(3.10) $\frac{(PI)(pf)}{(PF)(pi)} = \frac{RI}{ri}$

HI and hi are very small, so they are essentially straight lines. Thus,
$\frac{RI}{ri} = \frac{(HI)\sin(\angle RHI)}{(hi)\sin(\angle rhi)}$. Now, $\angle RHI$ covers arc IHK, and $\angle rhi$ covers arc ihk. These
arcs are nearly the same, so that $\sin(\angle RHI) \approx \sin(\angle rhi)$. Thus, we may
replace $\frac{RI}{ri}$ with $\frac{HI}{hi}$, and we obtain

(3.11) $\frac{(PI)(pf)}{(PF)(pi)} = \frac{HI}{hi}$

File this equation away for now. By similar triangles, we have $\frac{PI}{IQ} = \frac{PS}{SE}$,
hence $\frac{PI}{PS} = \frac{IQ}{SE}$. Likewise $\frac{pi}{ps} = \frac{iq}{se}$, but since SE = se we can write $\frac{pi}{ps} = \frac{iq}{SE}$
instead. Multiplying the first ratio by the reciprocal of the second gives

(3.12) $\frac{(PI)(ps)}{(PS)(pi)} = \frac{IQ}{iq}$

Now take the product of (3.11) and (3.12). This gives

(3.13) $\frac{(PI)^2(pf)(ps)}{(pi)^2(PF)(PS)} = \frac{(IH)(IQ)}{(ih)(iq)}$

Thus

(3.14) $\frac{(pf)(ps)}{(PF)(PS)} = \frac{\frac{(IH)(IQ)}{(PI)^2}}{\frac{(ih)(iq)}{(pi)^2}}$

And, finally

(3.15) $\frac{(ps)^2}{(PS)^2} = \frac{\left(\frac{(IH)(IQ)}{(PI)^2}\right)\frac{PF}{PS}}{\left(\frac{(ih)(iq)}{(pi)^2}\right)\frac{pf}{ps}}$

which is what we wanted to show (recall (3.7)). This proves that the
accelerations on P and p by the rings generated by revolving HI and hi around SP
and sp are in the ratio $\frac{(ps)^2}{(PS)^2}$. A similar argument shows that the same holds
of the rings generated by KL and kl as well. We can divide O into a large
number of very thin rings. We then have the property that, for any ring V
determined by arc HI there is a unique ring v determined by the arc hi with
$\angle PHI = \angle phi$ and such that the ratio of the accelerations exerted by V on P
and by v on p is $\frac{(ps)^2}{(PS)^2}$. Adding all of the rings together shows that the ratio
of the accelerations exerted by O on P and by O on p is $\frac{(ps)^2}{(PS)^2}$ as well. Recall
that this was all done for a hollow shell O. If O is a solid sphere, however,
we just think of it as the sum of a large number of thin, hollow shells. The
ratio for the accelerations from each of the hollow shells on P and p is $\frac{(ps)^2}{(PS)^2}$,
and when we add all of them together that ratio persists. □
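
For readers who would like a numerical sanity check of the shell argument, here is a minimal sketch. It is not Newton's construction: it simply slices a thin shell of radius 1 into rings by polar angle, uses the ring area $2\pi xw$ from Lemma 2, keeps only the component of each pull along the axis, and adds everything up. The unit constants and the function name `shell_attraction` are assumptions made for illustration.

```python
import math

def shell_attraction(d, n=20000):
    """Axial pull on a particle at distance d > 1 from the centre of a thin
    shell of radius 1, with unit surface density and unit constant k."""
    total = 0.0
    dtheta = math.pi / n
    for j in range(n):
        theta = (j + 0.5) * dtheta            # polar angle of the ring
        x = math.sin(theta)                   # distance from ring to the axis
        ring_area = 2 * math.pi * x * dtheta  # Lemma 2 with width w = dtheta
        dist2 = d * d + 1 - 2 * d * math.cos(theta)   # (PI)^2
        axial_fraction = (d - math.cos(theta)) / math.sqrt(dist2)
        total += ring_area / dist2 * axial_fraction
    return total

if __name__ == "__main__":
    near, far = shell_attraction(2.0), shell_attraction(3.0)
    print(near / far, (3.0 / 2.0) ** 2)   # the two numbers should agree closely
```

The ratio of the pulls at distances 2 and 3 comes out as $(3/2)^2$, and each pull equals the total area $4\pi$ divided by the distance squared, as if the whole shell sat at its center.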

Having dispensed with that difficulty, let's find others to worry about. 
The last few theorems give convincing evidence that there is an acceleration 
called gravity between all chunks of matter, inversely proportional to the 
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distance between the chunks squared, which keeps all of the orbits going. 
But since I can't just leave well enough alone, I'm going to pile some more 
evidence on. Above we've simplified in every case by assuming circular orbits 
instead of elliptical, because ellipses are difficult. Now it's time to take on 
the elliptical orbit. First, three pretty lemmas about ellipses. 

Lemma 3 Let E be an ellipse with foci a and b and major axis Om. Let
p be a point on the ellipse as shown below. Choose c' on ap and c on pb
extended so that c'Oc is parallel to the tangent to the ellipse at p. Then
pc' = pc = Om.



Proof: In the previous section it was shown that $\angle apx = \angle bpy$. Since
c'Oc is parallel to xpy, this implies $\angle pc'O = \angle pcO$, so that triangle cpc' is
isosceles. Thus, c'p = cp. Choose b' on ap so that bb' is parallel to the
tangent at p. Triangles abb' and aOc' are similar, so $\frac{ac'}{c'b'} = \frac{aO}{Ob} = 1$. Thus,
ac' = c'b' = cb. We have

cp + c'p = cb + bp + pc' = c'a + pc' + pb = pa + pb = am + bm = 2Om

The second to last equality is due to the fact, proved in the previous section,
that the sum of the distances from the foci of an ellipse to the points on the
ellipse is a constant. The result follows. □
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Lemma 4 Let E be an ellipse with axes AC and BC . Let P be a point on 
the ellipse, and let DK be the line through the center C of the ellipse parallel
to the tangent at P. Let F be on DK so that PF is perpendicular to DK.
Then (PF)(CK) = (AC)(BC).




Proof: This is trivial when the ellipse is a circle. For the general case, 
consider an affine transformation from a circle to the ellipse (if you don't know 
what an affine transformation is, draw a circle with a marker on a plane of 
glass and then let sunlight pass through it and strike the ground, casting a 
shadow of the circle on the ground. By altering the angle of the glass you 
can produce as the shadow an ellipse of any eccentricity. This is the required 
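transformation). Such transformations preserve ratios of areas. Since (PF)(CK)
is the area of the parallelogram with sides CP and CK, and (AC)(BC) is the
area of the rectangle with sides AC and BC, we can reduce the case of the
general ellipse to that of the circle. □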
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Lemma 5 Let GCP be a straight line through the center C of an ellipse. 
Let v be a point on GP and Q a point on the ellipse so that Qv is parallel to 
the tangent at P. Draw the line DCK parallel to the tangent at P. Then 

(3.16) $\frac{(Gv)(vP)}{(Qv)^2} = \frac{(PC)^2}{(DC)^2}$

Proof: Let us suppose first that the ellipse is in fact a circle, and let's just
assume that GP is vertical for simplicity.

Clearly $\angle GvR = \angle QvP$, and we also have $\angle GRv = \angle QPv$ and $\angle vGR = \angle vQP$
as well, since these pairs of angles cover the same intervals on the circle.
Thus, triangles GvR and QvP are similar, and $\frac{Gv}{Qv} = \frac{vR}{vP}$, hence (Qv)(vR) =
(Gv)(vP). Since Qv = vR, and DC = PC in a circle, we see

(3.17) $\frac{(Gv)(vP)(DC)^2}{(Qv)^2(PC)^2} = \frac{(Gv)(vP)}{(Qv)^2} = 1$

so that (3.16) holds.

How do we go from a circle to an ellipse? Having learned our lesson from the
previous lemma, we project the circle with an affine transformation. Choose a
circle which projects to the ellipse in question, and choose points
G', D', Q', P', K', C', v' on and in the circle which project to points
G, D, Q, P, K, C, v on and in the ellipse.

From the work above, we know that

(3.18) $\frac{(G'v')(v'P')(D'C')^2}{(Q'v')^2(P'C')^2} = 1$

A property of this type of transformation is that, if a'b' and c'd' are two line
segments which lie on parallel lines and which project to line segments ab
and cd, then $\frac{a'b'}{c'd'} = \frac{ab}{cd}$. Thus

(3.19) $1 = \frac{\frac{G'v'}{P'C'} \cdot \frac{v'P'}{P'C'}}{\frac{(Q'v')^2}{(D'C')^2}} = \frac{\frac{Gv}{PC} \cdot \frac{vP}{PC}}{\frac{(Qv)^2}{(DC)^2}}$

Thus,

(3.20) $\frac{(Gv)(vP)}{(Qv)^2} = \frac{(PC)^2}{(DC)^2}$

and we are done. □



Now let's look at one of Newton's theorems on elliptical orbits. 

Theorem 8 Suppose that an object O moves in an elliptical orbit due to
an acceleration a towards an unmoving object S which depends only on the
distance R between S and O. Then $a \sim 1/R^2$.
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Proof: Here is the diagram that appears with this theorem in Principia. 



[Figure: Newton's diagram for this proposition.]

S and H are the foci of the ellipse. PF is perpendicular to DK, and QT
is perpendicular to PS. O resides at P, and Q is a point very close to P.
In the style of Newton, we are going to collect
a bunch of relations between ratios, then multiply them together. To begin
with, it will be convenient to define $L = \frac{2(BC)^2}{AC}$. The first relation is

(3.21) $\frac{L(QR)}{L(Pv)} = \frac{QR}{Pv} = \frac{PE}{PC} = \frac{AC}{PC}$

The first equality is obvious, the second equality is because triangles Pxv
and PCE are similar, and the third is Lemma 3. The second relation is
obvious.



(3.22) $\frac{L(Pv)}{(Gv)(Pv)} = \frac{L}{Gv}$

The next is Lemma 5.

(3.23) $\frac{(Gv)(vP)}{(Qv)^2} = \frac{(PC)^2}{(DC)^2}$

Next we have

(3.24) $\frac{(Qv)^2}{(Qx)^2} = 1$

This is only approximately true, but as Q gets very close to P, $\angle QPS$
approaches $\angle RPS$, and it follows from this that $\frac{Qv}{Qx} \to 1$. Next, we have

(3.25) $\frac{(Qx)^2}{(QT)^2} = \frac{(EP)^2}{(PF)^2} = \frac{(CA)^2}{(PF)^2} = \frac{(CD)^2}{(CB)^2}$

The first equality is because triangles EPF and QTx are similar, the second
is Lemma 3, and the third is Lemma 4. We now form a new equation with
the relationships (3.21)-(3.25). The left side of the equation is formed by
multiplying the leftmost parts of (3.21)-(3.25), and the right side is formed
by multiplying the rightmost parts of (3.21)-(3.25). Much cancellation occurs
on the left, and we get



(3.26) $\frac{L(QR)}{(QT)^2} = \frac{L(AC)(PC)^2(CD)^2}{(PC)(Gv)(CD)^2(CB)^2}$

Plugging in $L = \frac{2(BC)^2}{AC}$, this is

(3.27) $\frac{L(QR)}{(QT)^2} = \frac{2(CB)^2(PC)^2(CD)^2}{(PC)(Gv)(CD)^2(CB)^2} = \frac{2(PC)}{(Gv)}$

As Q approaches P, $Gv \to 2PC$, thus $\frac{2(PC)}{(Gv)} \to 1$, so we may take

(3.28) $\frac{L(QR)}{(QT)^2} = 1$

Multiply the equation $L(QR) = (QT)^2$ by $\frac{(SP)^2}{QR}$ to get

(3.29) $L(SP)^2 = \frac{(SP)^2(QT)^2}{QR}$

Since L is a constant depending on the ellipse, we may write this as

(3.30) $(SP)^2 \sim \frac{(SP)^2(QT)^2}{QR}$

Now, suppose Q is the point that occurs in the orbit some time t later than P.
For Q close to P the area of sector SQP is very close to the area of triangle
SQP, and thus by the corollary to Theorem 4 we have $t \sim (QT)(SP)$. Also,
by Theorem 5 QR is approximately $\frac{1}{2}at^2$. Thus, we get

(3.31) $(SP)^2 \sim \frac{t^2}{at^2} = \frac{1}{a}$

Since SP = R, the distance from S to O, we see $a \sim 1/R^2$, and we are done.

□ 

This section should have convinced the reader that a force directed at the 
sun satisfying an inverse square law is a likely culprit for moving the planets 
in their orbits. 

4 Proof that conical orbits result from an in- 
verse square law 

We now will assume that we have an inverse square law force acting on the 
planets, and deduce consequences. 

Theorem 9 Suppose that an object O moves along a curve and undergoes
an acceleration a towards an unmoving object S which depends only on r,
the distance between O and S. Suppose that at some time O is at a point
$P_1$, and that at some later time O is at another point $P_2$. For any point
r on $SP_1$, draw the line from r perpendicular to $SP_1$ with length equal to
the acceleration a(r) at distance r. The points a(r) trace out a curve. Let
$r_1 = P_1$, and let $r_2$ be the point on $SP_1$ so that $SP_2 = Sr_2$. Let v(P) be the
velocity of O at point P. Then $v(P_2)^2 - v(P_1)^2 = 2A$, where A is the area
determined by the curve a(r) and the line $SP_1$ between the points $r_1$ and $r_2$.

Proof: Divide the interval $[r_2, r_1]$ into N equal parts, $[x_N, x_{N-1}]$, $[x_{N-1}, x_{N-2}]$,
..., $[x_2, x_1]$, $[x_1, x_0]$, where $x_N = r_2$ and $x_0 = r_1$. The following gives an
example, with N = 4.

Let $K_0 = P_1$, $K_N = P_2$, and $K_i$ be the point on the curve $P_1P_2$ the
same distance from S as $x_i$. Draw the circular arc between $K_i$ and $x_i$ with
center at S, and let $I_i$ be the point on this arc which also lies on $SK_{i-1}$.
When N is very large, the segments $K_{i-1}K_i$ will be approximated as straight
lines. Let $T_i$ be the point on $K_{i-1}K_i$ so that $I_iT_i$ is perpendicular to $K_{i-1}K_i$.
We will approximate by assuming that the acceleration towards S on O is
$a(x_{i-1})$ when O is in the interval $K_iK_{i-1}$. This acceleration is along the
line $SK_{i-1}$. Resolve the acceleration into perpendicular components, along
$K_{i-1}T_i$ and $T_iI_i$. The acceleration along $T_iI_i$ does not change the velocity,
only the direction. Thus, the change in velocity comes from the acceleration
along $K_{i-1}T_i$. Since $\triangle K_{i-1}K_iI_i \sim \triangle K_{i-1}I_iT_i$,

(4.1) $\frac{K_{i-1}T_i}{K_{i-1}I_i} = \frac{K_{i-1}I_i}{K_{i-1}K_i} = \frac{x_{i-1}x_i}{K_{i-1}K_i}$

Thus, the acceleration along $K_{i-1}K_i$ is $a(x_{i-1})\left(\frac{x_{i-1}x_i}{K_{i-1}K_i}\right)$.
If $t_i$ is the amount of time it takes O to travel from $K_{i-1}$ to $K_i$, then
$v(K_i) - v(K_{i-1}) = a(x_{i-1})\left(\frac{x_{i-1}x_i}{K_{i-1}K_i}\right)t_i$. Here $t_i$ is given by the distance between
$K_{i-1}$ and $K_i$ divided by the velocity over this interval, which is approximately
$v(K_{i-1})$. We see that

(4.2) $v(K_i) - v(K_{i-1}) = a(x_{i-1})\left(\frac{x_{i-1}x_i}{K_{i-1}K_i}\right)\left(\frac{K_{i-1}K_i}{v(K_{i-1})}\right)$

Thus

(4.3) $v(K_{i-1})\left(v(K_i) - v(K_{i-1})\right) = a(x_{i-1})(x_{i-1}x_i)$

The right side is equal to the area of one of the rectangles in the picture
above. We see that if we add this expression for all i, the right hand side
becomes approximately equal to A. Let's multiply by 2 just for good measure
to get

(4.4) $2\sum_{i=1}^{N} v(K_{i-1})\left(v(K_i) - v(K_{i-1})\right) = 2A$

Now for a bit of trickery. If we choose N to be very large, then $v(K_i)$ and
$v(K_{i-1})$ will be very close to each other, so we may write

(4.5) $2\sum_{i=1}^{N} v(K_{i-1})\left(v(K_i) - v(K_{i-1})\right) \approx \sum_{i=1}^{N} \left(v(K_i) + v(K_{i-1})\right)\left(v(K_i) - v(K_{i-1})\right) = \sum_{i=1}^{N} \left(v(K_i)^2 - v(K_{i-1})^2\right)$

This last sum is a telescoping sum:

(4.6) $\left(v(K_1)^2 - v(K_0)^2\right) + \left(v(K_2)^2 - v(K_1)^2\right) + \ldots + \left(v(K_{N-1})^2 - v(K_{N-2})^2\right) + \left(v(K_N)^2 - v(K_{N-1})^2\right)$

The total is $v(K_N)^2 - v(K_0)^2 = v(P_2)^2 - v(P_1)^2$. Therefore,

(4.7) $v(r_2)^2 - v(r_1)^2 = 2A$

which is what we set out to prove. We made many approximations above,
but as N becomes very large the approximations become more and more
accurate, so that in the limit we get equality. □

In light of the work we did in the previous section, we need to be able to
calculate the area under the curve given by $a(r) = \frac{k}{r^2}$, with k a constant.

Proposition 10 Given the setup in the previous theorem, let a(r) be given
by $\frac{m}{r^2}$ for some constant m > 0. Then $A = m\left(\frac{1}{r_2} - \frac{1}{r_1}\right)$.

Proof: Fundamental theorem of calculus. See also the note at the end
of the paper. □

Combining Theorem 9 and Proposition 10 shows that, if we start an object
in motion at a distance $r_0$ from S with velocity $v(r_0)$, then for any other
distance r from S that the object attains we have $v(r)^2 - \frac{2m}{r} = v(r_0)^2 - \frac{2m}{r_0}$.
Let's isolate this as a lemma for future reference.

Lemma 6 If an object O is in motion about a motionless object S subject
only to an acceleration of $\frac{m}{r^2}$ towards S, then

(4.8) $v(r)^2 - \frac{2m}{r} = C$

where C is the constant $v(r_0)^2 - \frac{2m}{r_0}$.
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Corollary 1 from the previous section implies that once an object begins 
orbiting S in, say, a counterclockwise direction, it continues to orbit counter- 
clockwise. That is, it can not reverse itself at some point and orbit clockwise. 
We'll assume from now on that any object O in orbit around S is orbiting 
counterclockwise. Let $\alpha$ be the angle between SO and the tangent to the
curve that the object traces; this gives two choices for $\alpha$, but we will see
below that it doesn't matter which we choose, since the important quantity
will be the square of the sine of the angle.




S 



Lemma 7 For an object O in orbit about a motionless object S with the
same assumptions as in Lemma 6, we have

(4.9) $r^2 v(r)^2 \sin^2\alpha = Q$

where Q is a constant.

Proof: Let O' be a point on the orbit of O close to O. Complete the
triangle OO'S, and let $\alpha'$ be the angle between SO and OO', and $\alpha''$ be the
angle between SO' and O'O. Let t be the amount of time O takes to travel to O'.

By Corollary 1 of the previous section, the area of sector SO'O is Wt, for
some constant W. Triangle SO'O has about the same area as sector SO'O,
and OO' is about v(r)t, so we approximate

(4.10) $\frac{1}{2}(SO')(v(r)t)\sin\alpha'' = Wt$

hence

(4.11) $\frac{1}{2}(SO')v(r)\sin\alpha'' = W$

As we let O' go to O, the approximation becomes equality, $SO' \to r$, and
$\alpha' \to \alpha$. Furthermore, $\angle O'SO \to 0$, so since $\angle O'SO + \alpha' + \alpha'' = 180°$,
$\alpha'' \to 180° - \alpha$. Substituting these limits into (4.11) and using the fact that
$\sin(180° - \alpha) = \sin\alpha$, we get

(4.12) $\frac{1}{2}rv(r)\sin\alpha = W$

Squaring this equation and letting $Q = 4W^2$ completes the proof. □

These lemmas combine to give us 

Theorem 10 An object O in motion about an object S which is constantly
subject to an acceleration of $\frac{m}{r^2}$ in the direction of S satisfies

(4.13) $\csc^2\alpha = \frac{C}{Q}r^2 + \frac{2m}{Q}r$

where C, Q, and m are constants given earlier in this section, and $\alpha = \alpha(r)$
is the angle between SO and the tangent to the orbit at O.

Remark: So that we have the constants all in one place, $m = ar^2$, which
is assumed to be constant, $C = v_0^2 - \frac{2m}{r_0}$ and $Q = r_0^2 v_0^2 \sin^2\alpha_0$, where $r_0$, $v_0$
and $\alpha_0$ are the initial distance of O from S, velocity of O, and angle that O
is traveling from the radial line to S.

Proof: Rewrite the conclusion of Lemma 7 as $v(r)^2 = \frac{Q}{r^2\sin^2\alpha}$ and rewrite
the conclusion of Lemma 6 as $v(r)^2 = C + \frac{2m}{r}$. Combining these equations gives

(4.14) $\frac{Q}{r^2\sin^2\alpha} = C + \frac{2m}{r}$

Multiplying both sides by $\frac{r^2}{Q}$ gives

(4.15) $\csc^2\alpha = \frac{C}{Q}r^2 + \frac{2m}{Q}r$

□

Recall from section 1 that conics satisfy

(4.16) $\csc^2\alpha = \frac{(e^2 - 1)}{e^2(ao)^2}r^2 + \frac{2}{e(ao)}r$



We're clearly getting close, as (4.15) and (4.16) look to be pretty much the 
same equation. There are a few more technical details to deal with before 
we can conclude that objects must move in conic sections under the effect of 
an inverse square law. The following proposition is the first step. 
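
Before continuing, the relation $\csc^2\alpha = \frac{C}{Q}r^2 + \frac{2m}{Q}r$ can be tested numerically. The sketch below is an illustration only, and it cheats by using a step-by-step integrator (the leapfrog scheme) rather than geometry. It launches an object under an acceleration of $m/r^2$ towards the origin and reports the worst relative mismatch between the two sides along the orbit. The function name, step size, and launch values are assumptions chosen for the example.

```python
import math

def max_discrepancy(r0, v0, alpha0, m=1.0, dt=1e-3, steps=20000):
    """Largest relative gap between csc^2(alpha) and (C/Q) r^2 + (2m/Q) r
    along a simulated orbit; S sits at the origin, O starts at (r0, 0)."""
    C = v0 ** 2 - 2.0 * m / r0
    Q = (r0 * v0 * math.sin(alpha0)) ** 2
    x, y = r0, 0.0
    vx, vy = v0 * math.cos(alpha0), v0 * math.sin(alpha0)

    def accel(px, py):
        r3 = math.hypot(px, py) ** 3
        return -m * px / r3, -m * py / r3   # magnitude m / r^2, towards S

    ax, ay = accel(x, y)
    worst = 0.0
    for step in range(steps):
        # One leapfrog (velocity Verlet) step.
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        if step % 100 == 0:
            r = math.hypot(x, y)
            speed = math.hypot(vx, vy)
            sin_alpha = (x * vy - y * vx) / (r * speed)
            lhs = 1.0 / sin_alpha ** 2
            rhs = (C / Q) * r * r + (2.0 * m / Q) * r
            worst = max(worst, abs(lhs - rhs) / lhs)
    return worst

if __name__ == "__main__":
    print(max_discrepancy(1.0, 1.2, math.pi / 3))
```

For a launch at distance 1, speed 1.2, and angle 60° with m = 1, the mismatch stays far below one part in a thousand over more than a full revolution.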

Proposition 11 Suppose that an object O is orbiting an object S under an
acceleration of $\frac{m}{r^2}$, and suppose that the orbit of O contains a circular arc
centered at S. Then O must move in that same circle for all times in the
past and future.

Proof: Let OP be the circular arc in the orbit, and complete the circle
around S. Let d be the point opposite O, let $r_0 = SO$, let b be a point
on OP close to O, and let c be the point on the tangent at O so that bc is
perpendicular to said tangent. Suppose that it takes time t for O to move
to b. We resolve the motion Ob into components Oc and cb. Oc is due to
the velocity at O, and thus is equal to $v(r_0)t$. cb is due to the acceleration
towards S at O, and is therefore parallel to SO, and is equal to $\frac{m}{2r_0^2}t^2$ by
Theorem 5.



Let x denote the length of Ob. $\angle Odb$ and $\angle bOc$ both cover the arc Ob and
are therefore equal. Thus, $\triangle Odb \sim \triangle bOc$, and we have

(4.17) $\frac{x}{2r_0} = \frac{\frac{m}{2r_0^2}t^2}{x} \implies x^2 = \frac{m}{r_0}t^2$

As $b \to O$, $\angle bOc \to 0$, so that $\frac{x}{Oc} \to 1$. We can therefore replace x by
$v(r_0)t$ in (4.17) to get

(4.18) $v(r_0)^2t^2 = \frac{m}{r_0}t^2 \implies v(r_0)^2 = \frac{m}{r_0}$

Now, recall that the C in Lemma 6 and Theorem 10 is given by

(4.19) $C = v(r_0)^2 - \frac{2m}{r_0}$

We see that in our case, $C = -\frac{m}{r_0}$. Now, when O is moving in a circular arc
$\alpha = 90°$, and $\sin 90° = 1$, so the Q in Lemma 7 and Theorem 10 is given by

(4.20) $Q = v(r_0)^2 r_0^2$

We know that $v(r_0)^2 = \frac{m}{r_0}$, so $Q = mr_0$. Thus, by Theorem 10

(4.21) $\csc^2\alpha = \frac{-m/r_0}{mr_0}r^2 + \frac{2m}{mr_0}r = -\frac{1}{r_0^2}r^2 + \frac{2}{r_0}r = 1 - \left(\frac{r}{r_0} - 1\right)^2$

for all r throughout the orbit of O. But $\csc^2\alpha \geq 1$ for all $\alpha$, and
$1 - \left(\frac{r}{r_0} - 1\right)^2 \leq 1$ for all r. We see that $\csc^2\alpha = 1 - \left(\frac{r}{r_0} - 1\right)^2 = 1$ for all r that O can attain.
This can only be the case if $r = r_0$, so we conclude that $r = r_0$ is the only
possible distance between O and S. In other words, O moves in a circle of
radius $r_0$ forever, and must have at all times since the object was put in
motion. □

Now we find the conic that O must travel upon when put in motion. 

Proposition 12 If O is placed into orbit around S, then there is exactly one 
conic that O can travel along. 



Proof: By Theorem 10 the relationship

(4.22) $\csc^2\alpha = \frac{C}{Q}r^2 + \frac{2m}{Q}r$

persists throughout the orbit of O. If there is only one possible r which can
satisfy this, then we are in the case covered by the previous proposition and O
travels in a circle, which is a conic. If O does not travel in a circle, however,
we have to find a unique conic which satisfies (4.22). In the notation of
Theorem 2 of the first section, a conic (non-circular) is uniquely determined
by the eccentricity e and length ao, and satisfies

(4.23) $\csc^2\alpha = \frac{e^2 - 1}{e^2(ao)^2}r^2 + \frac{2}{e(ao)}r$

Equating coefficients in (4.22) and (4.23), we get

(4.24) $\frac{e^2 - 1}{e^2(ao)^2} = \frac{C}{Q}, \qquad \frac{2}{e(ao)} = \frac{2m}{Q}$

The second equation implies

(4.25) $\frac{1}{e^2(ao)^2} = \frac{m^2}{Q^2}$

and plugging this into the first gives

(4.26) $(e^2 - 1)\frac{m^2}{Q^2} = \frac{C}{Q}$

Thus,

(4.27) $e = \sqrt{\frac{QC}{m^2} + 1} = \frac{\sqrt{QC + m^2}}{m}$

Given this, the second equation in (4.24) implies that

(4.28) $ao = \frac{Q}{\sqrt{QC + m^2}}$

So we see that the conic is uniquely determined, i.e. there is only one conic
that satisfies (4.22). The reader may notice a possible problem, however.
How do we know that $QC + m^2 \geq 0$, so that we may in fact take the square
root? Reexamining the definitions of the constants, we see that Q > 0, but
that C can be any real number. Thus, for arbitrary Q, C, and m it can easily
happen that $QC + m^2 < 0$. What saves us, though, is that Q, C, and m are
not arbitrary. Recall that $C = v(r_0)^2 - \frac{2m}{r_0}$, and that $Q = v(r_0)^2 r_0^2 \sin^2\alpha_0$. If
$C \geq 0$ then we have no problems, so let us assume that C < 0. Then

$QC + m^2 = \left(v(r_0)^2 - \frac{2m}{r_0}\right)\left(v(r_0)^2 r_0^2 \sin^2\alpha_0\right) + m^2$
$\geq \left(v(r_0)^2 - \frac{2m}{r_0}\right)\left(v(r_0)^2 r_0^2\right) + m^2$
$= r_0^2 v(r_0)^4 - 2mr_0 v(r_0)^2 + m^2$
$= \left(r_0 v(r_0)^2 - m\right)^2 \geq 0$

If $QC + m^2 = 0$ it can be seen easily that O travels in a circle. Otherwise,
$QC + m^2 > 0$, and this entire construction works to generate a unique conic
upon which O can move. □
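
The recipe in this proof is mechanical enough to write down as a short routine. The sketch below is one possible rendering, with a hypothetical function name and tolerance: it forms C and Q from the launch data, then returns the type of conic together with e from (4.27) and ao from (4.28).

```python
import math

def conic_from_launch(r0, v0, alpha0, m, tol=1e-9):
    """Return (kind, e, ao) for an object launched at distance r0 with speed v0
    at angle alpha0 (radians, not 0 or pi) to the radial line, under an
    acceleration of m / r^2 towards S."""
    C = v0 ** 2 - 2.0 * m / r0
    Q = (r0 * v0 * math.sin(alpha0)) ** 2
    disc = max(Q * C + m * m, 0.0)   # non-negative by the argument in the proof
    e = math.sqrt(disc) / m          # equation (4.27)
    if e < tol:
        return "circle", 0.0, math.inf   # the directrix recedes to infinity
    ao = Q / math.sqrt(disc)         # equation (4.28)
    if abs(e - 1.0) < tol:
        kind = "parabola"
    elif e < 1.0:
        kind = "ellipse"
    else:
        kind = "hyperbola"
    return kind, e, ao

if __name__ == "__main__":
    print(conic_from_launch(1.0, 1.2, math.pi / 2, 1.0))
    print(conic_from_launch(1.0, 1.0, math.pi / 2, 1.0))
    print(conic_from_launch(1.0, 2.0, math.pi / 2, 1.0))
```

Note that e < 1 exactly when QC < 0, that is, when C < 0, so the sign of C alone decides between ellipse, parabola, and hyperbola.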
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We're almost there. We still have to prove that an object cannot travel in 
an orbit that is not a conic. That is, we need to prove that there is no other 



curve that satisfies (4.13). Try as I might, I couldn't find a geometrical argu- 
ment for this that varies notably from the standard proof that two functions 
with the same derivative and same value at a point coincide. I will therefore 
leave the proof of the following theorem to the reader. If the reader runs into 
trouble, they might find the transformation $(x, y) \to (e^x \cos y, e^x \sin y)$
useful, together with the calculus theorem alluded to earlier in this paragraph.

Theorem 11 Suppose that an object O is placed in orbit around S subject
to an equation of the form

(4.29) $\csc^2\alpha = \phi(r)$

Then there is at most one possible path that the object can move along which 
does not contain circular arcs centered at S. 



At long last, combining Theorem 10, Proposition 11, Proposition 12, and 



Theorem 11, we obtain 



Theorem 12 An object O subject to a force with an inverse square law di- 
rected at an unmoving object S will move along the path of a conic section. 

Let's take stock of where we are as it relates to the orbits of the planets. 
We have assumed the existence of an acceleration upon any object in the 
solar system that is directed at the sun, and which satisfies an inverse square 
law. We have proved that planets and other objects must move along conic
sections (Theorem 12). This gives Kepler's first law, that planets move in
ellipses (if they moved in parabolas or hyperbolas they would fly out of the
solar system, and we wouldn't think of them as planets). Kepler's second
law was proved at the beginning of the previous section, and in fact would 
hold for any force directed at the sun. All that remains is Kepler's third 
law, which is a snap compared to the first law. First, one last lemma about 
ellipses. 

Lemma 8 Let E be an ellipse with focus a and directrix L. Let o be the
point on L such that ao is perpendicular to L, and let e be the eccentricity of
E. Then the area of E is

(4.30) $\frac{\pi(ao)^2e^2}{(1 - e^2)^{3/2}}$
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Proof: Let X and Y denote the major and minor axes of E, respectively. 
Recall that E can be thought of as the projection of a circle of radius Y, and 
that projections preserve the ratio of areas. Let us inscribe E in a rectangle 
S', and inscribe a circle C of radius Y in a square S. 




We know that

(4.31) $\frac{Area(C)}{Area(S)} = \frac{Area(E)}{Area(S')}$

Since $Area(C) = \pi Y^2$, $Area(S) = 4Y^2$, and $Area(S') = 4XY$, we see that
$Area(E) = \pi XY$. Now, by the last proposition in the section on conics, we
have

(4.32) $X = \frac{(ao)e}{1 - e^2}$

(4.33) $Y = \frac{(ao)e}{\sqrt{1 - e^2}}$

Plugging these identities into $Area(E) = \pi XY$ gives the result. □

That exponent of 3/2 in the denominator in this lemma sure is suspicious, 
isn't it? Now for Kepler's third law. 
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Theorem 13 Let an object O orbit a fixed point S in an ellipse E subject 
only to a force directed towards S which satisfies an inverse square law. Let 
X be the major axis of the ellipse. Then $X^{3/2} \sim T$, where T is the length of
time for O to revolve once around S.

Proof: Suppose the object is set in motion at time 0 with initial velocity 
$v_0$, radius from S $r_0$, and with angle between tangent and radius $a_0$. If the 
planet moves a very short amount of time t to point P it will sweep out an 
area that is very close to a triangle, as below. 

[Figure: the thin triangle SOP swept out in time t.]

Since t is so small, $\angle SOP$ is approximately equal to $a_0$, and OP is 
approximately $v_0 t$. $SO = r_0$, so the area of triangle SOP is roughly 
$\frac{1}{2} t r_0 v_0 \sin a_0$. We will take this as the approximation to the area of 
sector SOP. Since equal areas are swept out in equal times, this relationship 
persists throughout the duration of the orbit. That is, if the planet travels 
for a length of time T, the radius to S sweeps out the area 
$\frac{1}{2} T r_0 v_0 \sin a_0$. Thus, to find the amount of time in one revolution of O 
about S we may set 

(4.34) $\frac{1}{2} T r_0 v_0 \sin a_0 = \mathrm{Area}(E)$

and solve for T. From the previous lemma, 

(4.35) $\mathrm{Area}(E) = \frac{\pi (ao)^2 e^2}{(1 - e^2)^{3/2}}$

The proof of Lemma 12 shows that 

(4.36) $e = \frac{\sqrt{QC + m^2}}{m}$

(4.37) $ao = \frac{Q}{\sqrt{QC + m^2}}$

with constants Q, C, and m as defined earlier in the section (Q defined in 
Lemma [7], C and m in Lemma [6]). Thus, 

(4.38) $\mathrm{Area}(E) = \frac{\pi Q^2/m^2}{(-QC/m^2)^{3/2}} = \frac{\pi m Q^{1/2}}{(-C)^{3/2}} = \frac{\pi m r_0 v_0 \sin a_0}{(-C)^{3/2}}$

Recall that O moves in an ellipse only when C < 0, so that we may safely 
raise (-C) to a non-integer power. In light of (4.34), we have 

(4.39) $T = \frac{2\pi m}{(-C)^{3/2}}$

We also know that 

(4.40) $X = \frac{(ao)e}{1 - e^2} = \frac{Q/m}{1 - (QC + m^2)/m^2} = \frac{m}{-C}$

Comparing (4.39) and (4.40) shows that, indeed, $X^{3/2} \sim T$; explicitly, 
$T = \frac{2\pi}{\sqrt{m}} X^{3/2}$. 

□ 
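
The chain of substitutions in this proof can be checked numerically. The sketch below takes the swept area per unit time to be $\frac{1}{2} r_0 v_0 \sin a_0$ (the usual triangle-area formula), computes Area(E) from (4.35)-(4.37) using $(ao)e = Q/m$, and confirms that $T / X^{3/2}$ does not depend on the initial conditions. The function name `period_from_area` and the sample inputs are assumptions made for illustration.

```python
import math

def period_from_area(r0, v0, a0, m):
    """Return (X, T) for an elliptical orbit, following the proof's steps.

    r0, v0, a0: initial distance, speed, and tangent-radius angle (radians);
    m: strength of the inverse-square acceleration m / r**2.
    """
    C = v0**2 - 2.0 * m / r0
    if C >= 0:
        raise ValueError("C must be negative for an elliptical orbit")
    Q = (r0 * v0 * math.sin(a0)) ** 2
    one_minus_e2 = -Q * C / m**2                        # 1 - e^2, from (4.36)
    area = math.pi * (Q / m) ** 2 / one_minus_e2**1.5   # Lemma 8 with (ao)e = Q/m
    sweep_rate = 0.5 * r0 * v0 * math.sin(a0)           # area swept per unit time
    return m / (-C), area / sweep_rate                  # X as in (4.40), then T

# Illustrative initial conditions: each ratio should equal 2*pi/sqrt(m).
for r0, v0, a0, m in [(1.0, 1.0, math.pi / 2, 1.0), (2.0, 0.7, 2.5, 3.0)]:
    X, T = period_from_area(r0, v0, a0, m)
    print(T / X**1.5, 2.0 * math.pi / math.sqrt(m))
```

Note that the angle $a_0$ enters both the area and the sweep rate and cancels, so the period depends only on C and m.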



5 An interesting problem 

Problem: Suppose that an object O is placed in orbit around S at a distance 
$r_0$, an initial velocity $v_0$, and an initial angle $a_0$ (assumed not equal to 0° or 
180°) to the radial line from S. Suppose that O is always subject to an 
acceleration of $m/r^2$ towards S. Determine which of the conic sections O will 
travel along, determine the closest distance O will attain from S, and in the 
case where O travels in an ellipse determine the maximal distance O attains 
from S. 

Remark: In the case of the parabola and hyperbola, we may need to run 
time backwards to achieve the minimum, as the object may be placed in 
motion moving away from S. 

Solution: We know from Theorem 10 that, with $C = v_0^2 - \frac{2m}{r_0}$ and 
$Q = r_0^2 v_0^2 \sin^2 a_0$, 

(5.1) $\csc^2 a = \frac{C}{Q} r^2 + \frac{2m}{Q} r$

From Theorem [3] we know that this represents a parabola if $v_0^2 - \frac{2m}{r_0} = 0$, 
an ellipse if $v_0^2 - \frac{2m}{r_0} < 0$, and a hyperbola if $v_0^2 - \frac{2m}{r_0} > 0$. That answers the 
first part of the problem. To deal with the rest, suppose first that we are in 
the case of a parabola. Then 

(5.2) $\csc^2 a = \frac{2m}{Q} r$

Since $\csc^2 a \geq 1$, the minimum that r can be is 
$\frac{Q}{2m} = \frac{r_0^2 v_0^2 \sin^2 a_0}{2m}$. Now 
suppose that O moves in an ellipse or hyperbola. Again the extremal values 
of r correspond to $\csc^2 a = 1$, that is, to $C r^2 + 2mr - Q = 0$, which by the 
quadratic formula happens when 

(5.3) $r = \frac{-m \pm \sqrt{m^2 + CQ}}{C}$

In the proof of Proposition 12 it was shown that $m^2 + CQ > 0$, so this 
equation makes sense. When C > 0 this gives one positive value for r, 
corresponding to the nearest point to S on the hyperbola, and when C < 0 
this gives two positive values, corresponding to the maximal and minimal 
distances to S on the ellipse. Plugging the values in for C and Q gives 

(5.4) $r = \frac{-m + \sqrt{m^2 + \left(v_0^2 - \frac{2m}{r_0}\right)\left(r_0^2 v_0^2 \sin^2 a_0\right)}}{v_0^2 - \frac{2m}{r_0}}$

as the minimum for both the hyperbola and the ellipse, and 

(5.5) $r = \frac{-m - \sqrt{m^2 + \left(v_0^2 - \frac{2m}{r_0}\right)\left(r_0^2 v_0^2 \sin^2 a_0\right)}}{v_0^2 - \frac{2m}{r_0}}$

as the maximum for the ellipse. □ 
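
The solution translates directly into a short computation. The following is a minimal sketch, assuming angles in radians and a small tolerance `tol` for deciding when C counts as zero in floating point. The function name `extreme_distances` and that tolerance are illustrative choices, not something specified in the text.

```python
import math

def extreme_distances(r0, v0, a0, m, tol=1e-12):
    """Classify the orbit and return (kind, r_min, r_max).

    r_max is None for the parabola and the hyperbola. a0 is in radians and
    must not be 0 or pi; tol is an assumed tolerance for treating C as zero.
    """
    C = v0**2 - 2.0 * m / r0
    Q = (r0 * v0 * math.sin(a0)) ** 2
    if abs(C) <= tol:
        return "parabola", Q / (2.0 * m), None    # minimum r allowed by (5.2)
    root = math.sqrt(max(m**2 + C * Q, 0.0))      # clamp tiny negative round-off
    r_min = (-m + root) / C                       # plus sign in (5.3)
    if C > 0:
        return "hyperbola", r_min, None
    return "ellipse", r_min, (-m - root) / C      # minus sign in (5.3)

# Illustrative inputs: one ellipse, one parabola, one hyperbola.
print(extreme_distances(1.0, 0.5, math.pi / 2, 1.0))
print(extreme_distances(2.0, 1.0, math.pi / 2, 1.0))
print(extreme_distances(1.0, 2.0, math.pi / 2, 1.0))
```

In the first example the object starts at its farthest point from S, so the returned maximum equals $r_0$.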

The complexity of these answers indicates that a solution by different 
methods is likely to be quite involved. 

6 Notes, references, and further reading 



1. The prevalence of $r^2$ and $\sin^2 a$ in the formulas in this paper reminds 
me a bit of rational trigonometry, as propounded by Norman Wildberger. 
Essentially this is trigonometry with the fundamental concepts being the 
squares of lengths and squares of sines of angles. Perhaps many of these 
theorems could be reworked and would have nicer proofs and statements in 
that framework. I haven't worked on it myself, but an interested reader 
might want to consider it. I'm not sure how something like Theorem [3] would 
fit in, given the presence of a linear term in r. 

2. The proof of Proposition [9] is based on a technique that, to my knowledge, 
was discovered by Japanese mathematicians a few centuries ago. I learned 
of it from [2], which is highly recommended. 

3. The Fundamental Theorem of Calculus was invoked only once, in the 
proof of Proposition [10]. But this could have been avoided if one is in a truly 
classical frame of mind, by a simple argument which is similar to the proof 
of Theorem [5]. 

4. The May, 1994 issue of The College Mathematics Journal contains a very 
interesting discussion on the question of whether Newton proved the theorem 
that an inverse square law implies conic section orbits. See also [5]. 